AITopics

2605.29272

Genre: Research Report (0.50)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.45)

arXiv.org Machine Learning May-28-2026

The Fundamental Limits of Fraud Detection in Card Payment Networks

Dhama, Gaurav

Card payment fraud detection is usually framed as a supervised classification problem. Although this approach has generated practical progress, improvement has remained incremental despite major advances in model architecture. We argue that this is not mainly a failure of function approximation or optimization, but a consequence of structural information impairments inherent to the payment ecosystem. We formalize card authorization as a sequential decision problem with delayed, censored, corrupted, and counterfactually missing feedback. We derive a minimax regret lower bound showing that these impairments enter multiplicatively in the denominator of the achievable learning rate. The bound implies that improving issuer reporting quality or reducing censorship can yield larger reductions in the regret floor than increasing model complexity. We also show that heterogeneity across issuers worsens learnability beyond what average impairment rates suggest. The paper contributes a theory of why fraud detection in payment networks is fundamentally harder than in standard online learning settings, identifies ecosystem information quality as the key bottleneck, and provides a theoretical basis for prioritizing investments in reporting infrastructure, dispute process quality, and selective exploration. The paper is theory-first and does not rely on proprietary transaction data.

artificial intelligence, data mining, machine learning, (14 more...)

2605.27557

Genre: Research Report (0.40)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Law Enforcement & Public Safety > Fraud (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Kuo, En-Ya, Motsch, Sebastien

EmDT: Embedding Diffusion Transformer for Tabular Data Generation in Fraud Detection

arXiv.org Machine Learning Mar-17-2026

Imbalanced datasets pose a difficulty in fraud detection, as classifiers are often biased toward the majority class and perform poorly on rare fraudulent transactions. Synthetic data generation is therefore commonly used to mitigate this problem. In this work, we propose the Clustered Embedding Diffusion-Transformer (EmDT), a diffusion model designed to generate fraudulent samples. Our key innovation is to leverage UMAP clustering to identify distinct fraudulent patterns, and train a Transformer denoising network with sinusoidal positional embeddings to capture feature relationships throughout the diffusion process. Once the synthetic data has been generated, we employ a standard decision-tree-based classifier (e.g., XGBoost) for classification, as this type of model remains better suited to tabular datasets. Experiments on a credit card fraud detection dataset demonstrate that EmDT significantly improves downstream classification performance compared to existing oversampling and generative methods, while maintaining comparable privacy protection and preserving feature correlations present in the original data.
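For readers unfamiliar with the sinusoidal positional embeddings mentioned in the abstract above, the following is a minimal sketch of the standard formulation (alternating sine and cosine at geometrically spaced frequencies). It is not taken from the EmDT paper: the function name `sinusoidal_embedding`, the `base` of 10000, and the interleaved layout are assumptions for illustration, and the abstract does not say whether the embedding indexes feature positions, diffusion timesteps, or both.

```python
import math


def sinusoidal_embedding(position, dim, base=10000.0):
    """Return a length-`dim` sinusoidal embedding of a scalar position.

    Hypothetical helper for illustration only; `dim` must be even.
    Entry 2i is sin(position * base**(-2i/dim)), entry 2i+1 is the matching cosine.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    embedding = []
    for i in range(dim // 2):
        # Frequencies decay geometrically from 1 toward 1/base.
        freq = base ** (-2.0 * i / dim)
        embedding.append(math.sin(position * freq))
        embedding.append(math.cos(position * freq))
    return embedding


if __name__ == "__main__":
    # Embed the first three positions (e.g. column indices of a tabular row).
    for pos in range(3):
        print(pos, [round(v, 4) for v in sinusoidal_embedding(pos, 4)])
```

In a Transformer denoiser, vectors like these would typically be added to (or concatenated with) the per-feature token embeddings before attention; the exact wiring in EmDT is not described in this listing.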

artificial intelligence, deep learning, machine learning, (18 more...)

2603.13566

Country:

North America > United States > Arizona > Maricopa County > Tempe (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (1.00)

Industry: Law Enforcement & Public Safety > Fraud (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Neural Information Processing Systems Feb-19-2026, 15:59:24 GMT

Realistic Synthetic Financial Transactions for Anti-Money Laundering Models

Erik Altman

The UN estimates 2-5% of global GDP or $0.8 - $2.0 trillion dollars are laundered

data mining, machine learning, natural language, (21 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(4 more...)

Genre: Research Report (0.68)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Information Technology > Security & Privacy (1.00)
Government > Tax (1.00)
(4 more...)

Technology:

Information Technology > e-Commerce > Financial Technology (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(5 more...)

FOX News Feb-15-2026, 16:17:07 GMT

Why physical ID theft is harder to fix than credit card fraud

Identity theft involving stolen driver's licenses creates lasting legal exposure unlike credit card fraud, as license numbers cannot be changed and require extensive cleanup efforts.

artificial intelligence, fraud, social media, (12 more...)


Country: North America > United States (1.00)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)
(2 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (0.96)
Information Technology > Security & Privacy (0.69)

Neural Information Processing Systems Feb-11-2026, 07:51:44 GMT

acc1ec4a9c780006c9aafd595104816b-Supplemental-Datasets_and_Benchmarks.pdf

algorithm, dataset, outlier, (16 more...)

Country:

Asia > Singapore (0.04)
North America > United States > Virginia (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology (1.00)
Law Enforcement & Public Safety > Fraud (0.46)

Technology:

Information Technology > Software (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
(4 more...)

Neural Information Processing Systems Feb-11-2026, 07:51:41 GMT

acc1ec4a9c780006c9aafd595104816b-Paper-Datasets_and_Benchmarks.pdf

algorithm, detection, outlier, (16 more...)

Country:

North America > United States > Virginia (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Arizona (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology (0.93)
Law Enforcement & Public Safety > Fraud (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Lumadjeng, Adia, Birbil, Ilker, Acar, Erman

ECSEL: Explainable Classification via Signomial Equation Learning

arXiv.org Machine Learning Jan-30-2026

We introduce ECSEL, an explainable classification method that learns formal expressions in the form of signomial equations, motivated by the observation that many symbolic regression benchmarks admit compact signomial structure. ECSEL directly constructs a structural, closed-form expression that serves as both a classifier and an explanation. On standard symbolic regression benchmarks, our method recovers a larger fraction of target equations than competing state-of-the-art approaches while requiring substantially less computation. Leveraging this efficiency, ECSEL achieves classification accuracy competitive with established machine learning models without sacrificing interpretability. Further, we show that ECSEL satisfies some desirable properties regarding global feature behavior, decision-boundary analysis, and local feature attributions. Experiments on benchmark datasets and two real-world case studies, i.e., e-commerce and fraud detection, demonstrate that the learned equations expose dataset biases, support counterfactual reasoning, and yield actionable insights.
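As background for the abstract above: a signomial is a sum of terms, each a real coefficient times a product of positive variables raised to real exponents. The sketch below only evaluates such an expression and turns it into a class label; it does not reproduce how ECSEL learns the coefficients and exponents. The function names, the sigmoid link, the `bias` parameter, and the 0.5 threshold are all assumptions made for illustration.

```python
import math


def signomial(x, coeffs, exponents):
    """Evaluate sum_k coeffs[k] * prod_i x[i] ** exponents[k][i].

    Inputs are required to be strictly positive so that real-valued
    exponents are always well defined.
    """
    if any(v <= 0 for v in x):
        raise ValueError("inputs must be strictly positive")
    total = 0.0
    for c, exps in zip(coeffs, exponents):
        term = c
        for v, a in zip(x, exps):
            term *= v ** a
        total += term
    return total


def classify(x, coeffs, exponents, bias=0.0):
    """Hypothetical decision rule: sigmoid of the signomial score, cut at 0.5."""
    score = signomial(x, coeffs, exponents) + bias
    prob = 1.0 / (1.0 + math.exp(-score))
    return prob, int(prob >= 0.5)


if __name__ == "__main__":
    # f(x1, x2) = 3 * x1 * sqrt(x2) - x1**2
    coeffs = [3.0, -1.0]
    exponents = [[1.0, 0.5], [2.0, 0.0]]
    print(signomial([2.0, 4.0], coeffs, exponents))
    print(classify([2.0, 4.0], coeffs, exponents, bias=-8.0))
```

Because every term is an explicit monomial, the sign and size of each exponent can be read off directly, which is the kind of closed-form transparency the abstract refers to.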

artificial intelligence, ecsel, machine learning, (17 more...)

2601.21789

Genre: Research Report > New Finding (0.68)

Industry:

Banking & Finance (1.00)
Health & Medicine > Therapeutic Area (0.93)
Law Enforcement & Public Safety > Fraud (0.88)
Information Technology > Services > e-Commerce Services (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Neural Information Processing Systems Dec-25-2025, 14:26:47 GMT

Realistic Synthetic Financial Transactions for Anti-Money Laundering Models

With the widespread digitization of finance and the increasing popularity of cryptocurrencies, the sophistication of fraud schemes devised by cybercriminals is growing. Money laundering -- the movement of illicit funds to conceal their origins -- can cross bank and national boundaries, producing complex transaction patterns. The UN estimates 2-5% of global GDP or $0.8 - $2.0 trillion dollars are laundered globally each year. Unfortunately, real data to train machine learning models to detect laundering is generally not available, and previous synthetic data generators have had significant shortcomings. A realistic, standardized, publicly-available benchmark is needed for comparing models and for the advancement of the area. To this end, this paper contributes a synthetic financial transaction dataset generator and a set of synthetically generated AML (Anti-Money Laundering) datasets. We have calibrated this agent-based generator to match real transactions as closely as possible and made the datasets public. We describe the generator in detail and demonstrate how the datasets generated can help compare different machine learning models in terms of their AML abilities. In a key way, using synthetic data in these comparisons can be even better than using real data: the ground truth labels are complete, whilst many laundering transactions in real data are never detected.

anti-money laundering model, name change, realistic synthetic financial transaction, (4 more...)

Country: North America > United States (0.07)

Industry:

Banking & Finance (0.98)
Law Enforcement & Public Safety > Fraud (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems Dec-25-2025, 10:52:17 GMT

Turning the Tables: Biased, Imbalanced, Dynamic Tabular Datasets for ML Evaluation

Evaluating new techniques on realistic datasets plays a crucial role in the development of ML research and its broader adoption by practitioners. In recent years, there has been a significant increase of publicly available unstructured data resources for computer vision and NLP tasks. However, tabular data -- which is prevalent in many high-stakes domains -- has been lagging behind. To bridge this gap, we present Bank Account Fraud (BAF), the first publicly available privacy-preserving, large-scale, realistic suite of tabular datasets. The suite was generated by applying state-of-the-art tabular data generation techniques on an anonymized, real-world bank account opening fraud detection dataset. This setting carries a set of challenges that are commonplace in real-world applications, including temporal dynamics and significant class imbalance. Additionally, to allow practitioners to stress test both performance and fairness of ML methods, each dataset variant of BAF contains specific types of data bias. With this resource, we aim to provide the research community with a more realistic, complete, and robust test bed to evaluate novel and existing methods.

dynamic tabular dataset, imbalanced, name change, (4 more...)

Industry: Law Enforcement & Public Safety > Fraud (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)